DHMM Based Automatic Language Identification System
Abstract
This paper describes the implementation of an automatic language identification (LID) system. Automatic language identification is the task of classifying an unknown speech utterance as one of a listed set of languages. The system is implemented using discrete hidden Markov models (DHMMs) and has two phases: training and testing. In the training phase, a common codebook of mel-frequency cepstral coefficient (MFCC) features is built from a large speech corpus covering all listed languages, and one language-specific DHMM is trained for each language. In the testing phase, MFCC features are extracted from the unknown speech and evaluated against each trained DHMM. The identified language is hypothesized from the likelihood of the unknown utterance's feature-vector sequence under each model. The OGI database is used for the study. Although the method is simple, the results are very impressive.
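The testing phase described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the codebook and the DHMM parameters (initial distribution pi, transition matrix A, emission matrix B) have already been trained, it assumes the language with the highest likelihood is selected, and it uses small invented values in place of real MFCC features. The function names quantize, log_likelihood, and identify, and the labels lang_a and lang_b, are hypothetical.

```python
import numpy as np

def quantize(features, codebook):
    """Map each feature vector to the index of its nearest codeword."""
    # Squared Euclidean distance between every frame and every codeword.
    dist = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)

def log_likelihood(symbols, pi, A, B):
    """Scaled forward algorithm: log P(symbols | discrete HMM)."""
    alpha = pi * B[:, symbols[0]]
    scale = alpha.sum()
    log_p = np.log(scale)
    alpha = alpha / scale
    for s in symbols[1:]:
        # Propagate through the transitions, then weight by the emission.
        alpha = (alpha @ A) * B[:, s]
        scale = alpha.sum()
        log_p += np.log(scale)
        alpha = alpha / scale
    return log_p

def identify(features, codebook, models):
    """Score the utterance against every language model; return the best."""
    symbols = quantize(features, codebook)
    scores = {lang: log_likelihood(symbols, *params)
              for lang, params in models.items()}
    return max(scores, key=scores.get), scores

# Toy stand-ins (assumed values, not real MFCCs or trained models).
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
models = {
    "lang_a": (pi, A, np.array([[0.9, 0.1], [0.8, 0.2]])),  # favors symbol 0
    "lang_b": (pi, A, np.array([[0.1, 0.9], [0.2, 0.8]])),  # favors symbol 1
}
features = np.array([[0.1, -0.1], [0.05, 0.2], [0.9, 1.1], [0.0, 0.1]])

best, scores = identify(features, codebook, models)
print(best, scores)
```

The forward pass rescales alpha at every frame and accumulates the logarithm of the scale factors, which keeps the computation stable for long utterances. Training the codebook (for example by k-means clustering) and the DHMMs (for example by Baum-Welch re-estimation) is outside this sketch.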
Similar resources
Automatic language identification using discrete hidden Markov model
In the recent automatic language identification research, phonotactic approach has been studied in which all training utterances are passed through a tokenizer in order to get phonetic sequences to train the language model of different languages. The true transcription of the utterances was totally ignored. However, information in the transcription may possess important discriminating power for...
Text Independent Language Recognition using Dhmm
Spoken Language Identification is the task of recognizing the language of an unknown speech utterance. The ability of machines to distinguish between different languages becomes an important concern with the emerging trends in global communications, which are multilingual in nature. This paper describes a text independent language recognition system using a common code book and discrete hidden M...
A Comparison of DHMM and DTW for Isolated Digits Recognition System of Arabic Language
Despite many years of concentrated research, the performance gap between automatic speech recognition (ASR) and human speech recognition (HSR) remains large. Especially for the Arabic language, research efforts are still limited in comparison with other languages such as English or Japanese. In this work, we have used two algorithms to implement a system of Automatic Recognition of isolate...
Transformation of Hand-Shape Features for a Biometric Identification Approach
The present work presents a biometric identification system for hand shape identification. The different contours have been coded based on angular descriptions forming a Markov chain descriptor. Discrete Hidden Markov Models (DHMM), each representing a target identification class, have been trained with such chains. Features have been calculated from a kernel based on the HMM parameter descript...
Automatic Semantic Role Labeling in Persian Sentences Using Dependency Trees
Automatic identification of words with semantic roles (such as Agent, Patient, Source, etc.) in sentences and attaching correct semantic roles to them, may lead to improvement in many natural language processing tasks including information extraction, question answering, text summarization and machine translation. Semantic role labeling systems usually take advantage of syntactic parsing and th...
Publication date: 2012